Abstract

The problem of estimating rigid motion from projections may be characterized using a nonlinear
dynamical system, composed of the rigid motion transformation and the perspective map.
The time derivative of the output of such a system, which is also called the “motion field”, is
bilinear  in  the  motion  parameters,  and  may  be  used  to  specify  a  subspace  constraint  on  either 
the  direction  of  translation  or  the  inverse  depth  of  the  observed  points.  Estimating  motion 
may  then  be  formulated  as  an  optimization  task  constrained  on  such  a  subspace.  Heeger  and 
Jepson  [5],  who  first  introduced  this  constraint,  solve  the  optimization  task  using  an  extensive 
search  over  the  possible  directions  of  translation. 

We reformulate the optimization problem in a systems theoretic framework as the
identification of a dynamic system in exterior differential form with parameters on a differentiable
manifold,  and  use  techniques  which  pertain  to  nonlinear  estimation  and  identification  theory 
to  perform  the  optimization  task  in  a  principled  manner.  The  general  technique  for  addressing 
such  identification  problems  [14]  has  been  used  successfully  in  addressing  other  problems  in 
computational  vision  [13,  12]. 

The  application  of  the  general  method  [14]  results  in  a  recursive  and  pseudo-optimal  solution 
of  the  motion  problem,  which  has  robustness  properties  far  superior  to  other  existing  techniques 
we  have  implemented. 

By  releasing  the  constraint  that  the  visible  points  lie  in  front  of  the  observer,  we  may  explain 
some  psychophysical  effects  on  the  nonrigid  percept  of  rigidly  moving  shapes. 

Experiments  on  real  and  synthetic  image  sequences  show  very  promising  results  in  terms  of 
robustness,  accuracy  and  computational  efficiency. 


1 Visual motion estimation from a dynamic model

Let a scene be represented by a set of N feature points in 3D space moving rigidly with respect
to the viewer; the visual motion problem is defined by the rigidity constraint and the perspective
projection equations. If $\mathbf{X}_i = [X_i\ Y_i\ Z_i]^T$ are the coordinates of the $i$-th point and
$\mathbf{x}_i = [x_i\ y_i]^T$ the corresponding projection, we may write

$$
\begin{cases}
\dot{\mathbf{X}}_i = \Omega \wedge \mathbf{X}_i + V, & \mathbf{X}_i(0) = \mathbf{X}_{i0} \\
\mathbf{x}_i = \pi(\mathbf{X}_i) + n_i, & \forall i = 1:N
\end{cases}
\qquad (1)
$$

where $n_i$ represents an error in measuring the position of the projection of point $i$ and $\pi$
represents an ideal perspective projection^1. Solving the visual motion problem consists in estimating
$V$, $\Omega$ from all the visible points, i.e. reconstructing the input of the above system from its noisy
output. We show that it is possible to invert the above system using a technique which has been
recently introduced in [14] for identifying systems in Exterior Differential form [2] with parameters
on a topological manifold.

* Research funded by the California Institute of Technology, ONR grant N00014-93-1-0990 and an AT&T
Foundation Special Purpose grant. This work is registered as CDS Technical Report CIT-CDS 94-005,
California Institute of Technology, January 1994 - revised February 1994.

The scheme is motivated by the work of Heeger and Jepson [6, 5] and may be considered as a
recursive solution of their task using methods which pertain to the field of nonlinear estimation and
identification theory. As a result, the minimization task which is the core of the subspace methods
for recovering rigid motion need not be performed by brute-force search, as is done in [5].
Instead, an Implicit Extended Kalman Filter (IEKF) [9, 3, 8, 14] estimates the
motion parameters recursively according to nonlinear prediction error criteria (for an introductory
treatment of Prediction Error Methods (PEM) in a linear context, see for example [16]). As a
consequence, our method exploits in a pseudo-optimal manner the information coming from a long stream
of images, making the scheme robust and computationally efficient.

2  Motion  reconstruction  via  (least  squares)  inversion  constrained 
on  subspaces 

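Consider the following expression of the “motion field”, i.e. the first derivative of the output of the
model (1):

$$
\dot{\mathbf{x}}(t) = \left[\, \tfrac{1}{Z} A(\mathbf{x}) \;\middle|\; B(\mathbf{x}) \,\right]
\begin{bmatrix} V(t) \\ \Omega(t) \end{bmatrix} \qquad (2)
$$

where

$$
A = \begin{bmatrix} 1 & 0 & -x \\ 0 & 1 & -y \end{bmatrix}, \qquad
B = \begin{bmatrix} -xy & 1 + x^2 & -y \\ -1 - y^2 & xy & x \end{bmatrix}. \qquad (3)
$$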
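As a numerical illustration of eqs. (2)-(3), the following minimal sketch evaluates the motion field of a single point and compares it with a finite-difference derivative of the projection under the rigid motion model (1). It assumes NumPy and a unit focal length; the function names are illustrative and do not come from the original report.

```python
import numpy as np

def motion_field_matrices(x, y):
    """A(x) and B(x) of eq. (3) for an image point (x, y), unit focal length."""
    A = np.array([[1.0, 0.0, -x],
                  [0.0, 1.0, -y]])
    B = np.array([[-x * y, 1.0 + x * x, -y],
                  [-(1.0 + y * y), x * y, x]])
    return A, B

def motion_field(x, y, Z, V, Omega):
    """Image velocity of eq. (2): (1/Z) A(x) V + B(x) Omega."""
    A, B = motion_field_matrices(x, y)
    return A @ V / Z + B @ Omega

def project(X):
    """Ideal perspective projection pi(X) = [X/Z, Y/Z]."""
    return X[:2] / X[2]

def finite_difference_field(X, V, Omega, dt=1e-6):
    """Numerical image velocity from one small step of X' = Omega ^ X + V."""
    X_next = X + dt * (np.cross(Omega, X) + V)
    return (project(X_next) - project(X)) / dt
```

For a point in general position the two velocities should agree up to the discretization error of the finite difference, which is a quick way to check the signs in $A$ and $B$.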


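If we observe enough points, we have an overdetermined system which we may solve for the motion
parameters in a least-squares sense. Call $C_i = \left[\, \tfrac{1}{Z_i} A_i \mid B_i \,\right]$; we have

$$
\begin{bmatrix} V(t) \\ \Omega(t) \end{bmatrix} =
\begin{bmatrix} C_1 \\ C_2 \\ \vdots \\ C_N \end{bmatrix}^{\dagger}
\begin{bmatrix} \dot{\mathbf{x}}_1 \\ \dot{\mathbf{x}}_2 \\ \vdots \\ \dot{\mathbf{x}}_N \end{bmatrix}
= C^{\dagger} \dot{\mathbf{x}}
$$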

where the symbol $\dagger$ denotes the pseudo-inverse. Note that $C_i$ depends on the depth $Z_i$ of the
point, which we do not know. By substituting the above expression into eq. (2), we obtain a
constraint on $Z_i$ [5], which consists of imposing that $\dot{\mathbf{x}}$ lies in the null space of the
orthogonal complement of the range of $C$. We may try to approximate this constraint in a least-squares
sense by solving w.r.t. $Z_i$ the following nonlinear optimization problem:

$$
\hat Z_i = \arg\min_{Z_i} \left\| \dot{\mathbf{x}} - C(Z) \begin{bmatrix} V \\ \Omega \end{bmatrix} \right\| \qquad (4)
$$

i.e. we are looking for $Z_i$, $i = 1 \ldots N$, such that $\dot{\mathbf{x}}$ lies in the null space of the
orthogonal complement of the range of $C$. If $C$ were invertible, the above constraint would be satisfied
trivially for all motions. However, when $2N > 3$, $CC^{\dagger}$ has rank at most three, and hence
$(I - CC^{\dagger}) \neq 0$.

^1 More articulated camera models may be employed. However, we do not address them here.
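The residual minimized in eq. (4) can be evaluated numerically as in the following minimal sketch, which stacks the blocks $C_i$, solves for the motion with the pseudo-inverse and measures what is left. It assumes NumPy; the synthetic scene and all names are hypothetical and serve only to exercise the functions.

```python
import numpy as np

def point_block(x, y, inv_depth):
    """C_i = [(1/Z_i) A_i | B_i] for one image point (2 x 6)."""
    A = np.array([[1.0, 0.0, -x], [0.0, 1.0, -y]])
    B = np.array([[-x * y, 1.0 + x * x, -y],
                  [-(1.0 + y * y), x * y, x]])
    return np.hstack([inv_depth * A, B])

def stack_C(points, depths):
    """Stack the per-point blocks into the 2N x 6 matrix C(Z)."""
    return np.vstack([point_block(x, y, 1.0 / Z)
                      for (x, y), Z in zip(points, depths)])

def depth_residual(points, depths, xdot):
    """Norm of xdot - C(Z) C(Z)^+ xdot and the least-squares motion [V; Omega]."""
    C = stack_C(points, depths)
    motion = np.linalg.pinv(C) @ xdot
    return np.linalg.norm(xdot - C @ motion), motion

# Hypothetical noise-free synthetic scene (assumption, for illustration only).
rng = np.random.default_rng(0)
points = rng.uniform(-0.5, 0.5, size=(8, 2))
true_depths = rng.uniform(2.0, 6.0, size=8)
true_motion = np.array([0.1, -0.05, 0.2, 0.01, -0.02, 0.03])  # [V; Omega]
xdot = stack_C(points, true_depths) @ true_motion
```

With the true depths the residual vanishes and the motion is recovered; with a different depth assignment (other than a common rescaling, which leaves the range of $C$ unchanged) the residual is in general nonzero.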


2.1  Recovery  of  direction  of  translation 


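Note that the minimization described above is performed with respect to the depth of each point
in space. However, the roles of depth and translation may be interchanged, as is evident from the
structure of the matrix $C$. We may therefore “pseudo-invert” the system above with respect to depth
and rotation and then perform the minimization with respect to the direction of translation in
$S^2$ (the two-sphere of radius one). For each point $i$ we have

$$
\dot{\mathbf{x}}_i(t) = \left[\, A_i(\mathbf{x}_i) V(\theta, \phi) \;\middle|\; B_i(\mathbf{x}_i) \,\right]
\begin{bmatrix} \tfrac{1}{Z_i(t)} \\ \Omega(t) \end{bmatrix}
$$

where $V \in S^2$ is represented in local coordinates as $V(\theta, \phi)$. If we observe $N$ points we may write
$\dot{\mathbf{x}} = C(\theta, \phi) \left[ \tfrac{1}{Z_1}, \ldots, \tfrac{1}{Z_N}, \Omega^T \right]^T$, where

$$
C(\theta, \phi) = \begin{bmatrix} A_1 V & & & B_1 \\ & \ddots & & \vdots \\ & & A_N V & B_N \end{bmatrix}.
$$

Now, proceeding in a similar way as before, we could seek $\theta, \phi$ which satisfy the following
subspace algebraic constraint:

$$
\left[ I - C(\theta, \phi)\, C^{\dagger}(\theta, \phi) \right] \dot{\mathbf{x}} = 0. \qquad (5)
$$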
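The left-hand side of constraint (5) can be evaluated as in the minimal sketch below. The text does not specify the local coordinates $V(\theta, \phi)$, so a standard longitude/latitude parameterization is assumed here; NumPy, the function names and the synthetic data are likewise assumptions made for illustration.

```python
import numpy as np

def translation_direction(theta, phi):
    """Assumed local coordinates on the unit sphere (not specified in the text)."""
    return np.array([np.cos(theta) * np.cos(phi),
                     np.sin(theta) * np.cos(phi),
                     np.sin(phi)])

def stack_C_translation(points, V):
    """2N x (N + 3) matrix C: block row i holds A_i V in column i and B_i last."""
    N = len(points)
    C = np.zeros((2 * N, N + 3))
    for i, (x, y) in enumerate(points):
        A = np.array([[1.0, 0.0, -x], [0.0, 1.0, -y]])
        B = np.array([[-x * y, 1.0 + x * x, -y],
                      [-(1.0 + y * y), x * y, x]])
        C[2 * i:2 * i + 2, i] = A @ V
        C[2 * i:2 * i + 2, N:] = B
    return C

def subspace_residual(points, xdot, theta, phi):
    """Two-norm of [I - C C^+] xdot, the left-hand side of constraint (5)."""
    C = stack_C_translation(points, translation_direction(theta, phi))
    return np.linalg.norm(xdot - C @ (np.linalg.pinv(C) @ xdot))

# Hypothetical noise-free synthetic data (assumption, for illustration only).
rng = np.random.default_rng(1)
points = rng.uniform(-0.5, 0.5, size=(10, 2))
inv_depths = 1.0 / rng.uniform(2.0, 6.0, size=10)
Omega = np.array([0.01, -0.02, 0.03])
theta_true, phi_true = 0.4, -0.3
C_true = stack_C_translation(points, translation_direction(theta_true, phi_true))
xdot = C_true @ np.concatenate([inv_depths, Omega])
```

The residual vanishes at the true direction and, because negating $V$ only flips the sign of the first $N$ columns of $C$, also at the antipodal direction, which corresponds to negative depths.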


Note that we are trying to “adapt” the orthogonal complement of $C$, which is highly structured
as a function of $\theta, \phi$, until a given vector $\dot{\mathbf{x}}$ lies in its null space. Heeger and
Jepson [5] solve this problem by minimizing the two-norm of the above constraint using an extensive
search over $\theta, \phi$, or a sampling of the sphere. This procedure does not exploit any of the
geometric structure of the problem. Furthermore, it does not take into account the measurement noise,
which enters into the minimization in a highly structured fashion, and it is computationally expensive.
Temporal coherence of motion is also not taken into account, whereas at each step we want to exploit
all the processing performed at the previous time instant and update the motion estimates recursively.
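For comparison, the search over a sampling of the sphere can be sketched as a plain grid search on the residual. This is only a schematic illustration of the brute-force strategy, not the implementation of [5]; it assumes NumPy, a longitude/latitude parameterization of the sphere, a 10-degree grid and hypothetical synthetic data.

```python
import numpy as np

def direction(theta, phi):
    """Assumed longitude/latitude coordinates on the unit sphere."""
    return np.array([np.cos(theta) * np.cos(phi),
                     np.sin(theta) * np.cos(phi),
                     np.sin(phi)])

def build_C(points, V):
    """2N x (N + 3) matrix with A_i V in column i and B_i in the last three."""
    N = len(points)
    C = np.zeros((2 * N, N + 3))
    for i, (x, y) in enumerate(points):
        A = np.array([[1.0, 0.0, -x], [0.0, 1.0, -y]])
        B = np.array([[-x * y, 1.0 + x * x, -y],
                      [-(1.0 + y * y), x * y, x]])
        C[2 * i:2 * i + 2, i] = A @ V
        C[2 * i:2 * i + 2, N:] = B
    return C

def residual(points, xdot, V):
    """Two-norm of the component of xdot outside the range of C(V)."""
    C = build_C(points, V)
    return np.linalg.norm(xdot - C @ (np.linalg.pinv(C) @ xdot))

def grid_search(points, xdot, thetas, phis):
    """Return (residual, theta, phi) of the grid node with the smallest residual."""
    best = (np.inf, None, None)
    for theta in thetas:
        for phi in phis:
            r = residual(points, xdot, direction(theta, phi))
            if r < best[0]:
                best = (r, theta, phi)
    return best

# Hypothetical noise-free data; the true direction is placed on a grid node.
thetas = np.linspace(-np.pi, np.pi, 37)
phis = np.linspace(-np.pi / 2, np.pi / 2, 19)
rng = np.random.default_rng(2)
points = rng.uniform(-0.5, 0.5, size=(10, 2))
V_true = direction(thetas[22], phis[6])
unknowns = np.concatenate([1.0 / rng.uniform(2.0, 6.0, size=10),
                           [0.01, -0.02, 0.03]])
xdot = build_C(points, V_true) @ unknowns
```

Every grid node costs one pseudo-inverse of a $2N \times (N+3)$ matrix, and nothing is reused between nodes or between successive frames. The search also cannot distinguish the true direction from its antipode.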

The method for performing the minimization task described above in a principled way is
presented in section 3: we rephrase the problem as the identification of an exterior differential system
with parameters on the two-sphere. The method outputs motion estimates together with their
reliability in the form of the second order statistics of the estimation error. Such statistics may be
used in subsequent modules for estimating structure.


2.2 Recovery of the mean distance

In many applications it is of interest to estimate the average distance of an object from the camera
(position of the centroid). For this case, it is sufficient to consider the minimization in eq. (4) when
$Z_i = Z_c\ \forall i$, where $Z_c$ is the distance of the centroid. The solution to such a problem is
analogous to the recovery of translation, and will be presented in section 3.

2.3  Recovery  of  rotation  and  depth 

Once the direction of translation has been recovered, we may derive the rotational velocity and
inverse depth in a least-squares fashion from their definition:

$$
\begin{bmatrix} \tfrac{1}{Z_1} \\ \vdots \\ \tfrac{1}{Z_N} \\ \Omega \end{bmatrix}
= C^{\dagger}(\theta, \phi)\, \dot{\mathbf{x}}.
$$

The motion estimates may be fed, together with the variance of their estimation error, into a
recursive structure from motion module which processes motion error, such as for example [11, 15].
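The least-squares recovery of inverse depth and rotation for a given direction of translation can be sketched as follows. The sketch assumes NumPy and noise-free data; the function names and the synthetic scene are hypothetical.

```python
import numpy as np

def stack_C_translation(points, V):
    """2N x (N + 3) matrix with A_i V in column i and B_i in the last three."""
    N = len(points)
    C = np.zeros((2 * N, N + 3))
    for i, (x, y) in enumerate(points):
        A = np.array([[1.0, 0.0, -x], [0.0, 1.0, -y]])
        B = np.array([[-x * y, 1.0 + x * x, -y],
                      [-(1.0 + y * y), x * y, x]])
        C[2 * i:2 * i + 2, i] = A @ V
        C[2 * i:2 * i + 2, N:] = B
    return C

def recover_rotation_and_depth(points, xdot, V):
    """Least-squares solution [1/Z_1 ... 1/Z_N, Omega] = C(V)^+ xdot."""
    N = len(points)
    solution = np.linalg.pinv(stack_C_translation(points, V)) @ xdot
    return solution[:N], solution[N:]

# Hypothetical synthetic scene (assumption, for illustration only).
rng = np.random.default_rng(3)
points = rng.uniform(-0.5, 0.5, size=(10, 2))
inv_depths_true = 1.0 / rng.uniform(2.0, 6.0, size=10)
Omega_true = np.array([0.01, -0.02, 0.03])
V = np.array([0.6, 0.0, 0.8])  # unit-norm direction of translation
xdot = stack_C_translation(points, V) @ np.concatenate([inv_depths_true, Omega_true])
inv_depths, Omega = recover_rotation_and_depth(points, xdot, V)
```

Since only the direction of translation is known, the recovered inverse depths are defined up to the unknown norm of the translation.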

3 Solving the subspace optimization via identifying an exterior
differential system with parameters on a differentiable manifold

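Let us define $\alpha = [\theta, \phi]^T$. The $\mathbf{x}_i$ are measured up to some error,
$\mathbf{y}_i = \mathbf{x}_i + n_i$, $n_i \in \mathcal{N}(0, R_{n_i})$, which induces an error in the
derivative: $\mathbf{y}'_i = \dot{\mathbf{x}}_i + n'_i$. Call $\dot{\mathbf{x}}$ the column vector obtained by
stacking the components $\dot{\mathbf{x}}_i$, and similarly for $\mathbf{x}$. Now define
$C^{\perp}(\mathbf{x}, \alpha) = I - C (C^T C)^{-1} C^T$. Finally the subspace constraint (5) may be written as
$C^{\perp}(\mathbf{x}, \alpha)\, \dot{\mathbf{x}} = 0$. Now

$$
\begin{cases}
C^{\perp}(\mathbf{x}, \alpha)\, \dot{\mathbf{x}} = 0, & V(\alpha) \in S^2 \\
\mathbf{y}_i = \mathbf{x}_i + n_i, & \forall i
\end{cases}
$$

represents a system in Exterior Differential Form. Solving for translation is equivalent to identifying
the above exterior differential system with parameters on a differentiable manifold (the sphere in
this case) from the noisy data $\mathbf{y}$.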

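We have addressed this problem using the general methods presented in [14]. The solution is
given by the simple iteration

Prediction step

$$
\begin{cases}
\hat\alpha(t+1|t) = \hat\alpha(t|t), & \hat\alpha(0|0) = \alpha_0 \\
P(t+1|t) = P(t|t) + R_{\alpha}(t), & P(0|0) = P_0
\end{cases}
$$

Update step

$$
\begin{cases}
\hat\alpha(t+1|t+1) = \hat\alpha(t+1|t) + L(t+1)\, C^{\perp}(\mathbf{y}(t), \hat\alpha(t+1|t))\, \mathbf{y}'(t) \\
P(t+1|t+1) = \Gamma(t+1) P(t+1|t) \Gamma^T(t+1) + L(t+1) D^{\perp}(t) R_n(t+1) D^{\perp T}(t) L^T(t+1)
\end{cases}
$$

where

$$
\begin{cases}
L(t+1) = P(t+1|t)\, C^{\perp T}(t+1)\, \Lambda^{-1}(t+1) \\
\Lambda(t+1) = C^{\perp}(t+1) P(t+1|t) C^{\perp T}(t+1) + D^{\perp}(t) R_n(t+1) D^{\perp T}(t) \\
\Gamma(t+1) = I - L(t+1) C^{\perp}(t+1) \\
D^{\perp}(t) = \left( \frac{\partial\, C^{\perp} \dot{\mathbf{x}}}{\partial \mathbf{x}} \right) \Big|_{\mathbf{y}(t)}
\end{cases}
$$

as from Appendix A of [14].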
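The structure of one prediction/update cycle of an implicit extended Kalman filter can be sketched generically as below. This is a minimal illustration under several assumptions, not the filter of [14]: the Jacobians of the implicit constraint $h(\alpha, y) = 0$ are taken by finite differences, the sign convention is the textbook one (the gain multiplies minus the constraint value, which may differ from the convention used in the text), and the toy linear constraint used to exercise the step is hypothetical. NumPy is assumed.

```python
import numpy as np

def numerical_jacobian(f, z, eps=1e-6):
    """Central-difference Jacobian of f at z."""
    z = np.asarray(z, dtype=float)
    f0 = np.atleast_1d(f(z))
    J = np.zeros((f0.size, z.size))
    for k in range(z.size):
        dz = np.zeros_like(z)
        dz[k] = eps
        J[:, k] = (np.atleast_1d(f(z + dz)) - np.atleast_1d(f(z - dz))) / (2 * eps)
    return J

def iekf_step(alpha, P, y, h, R_alpha, R_n):
    """One cycle for a constant-parameter model with implicit measurement h(alpha, y) = 0."""
    # Prediction: the parameters are modelled as constant plus process noise.
    alpha_pred = np.array(alpha, dtype=float)
    P_pred = P + R_alpha
    # Linearization of the constraint about the prediction and the measurement.
    C = numerical_jacobian(lambda a: h(a, y), alpha_pred)
    D = numerical_jacobian(lambda v: h(alpha_pred, v), y)
    # Innovation covariance and gain; pinv guards against a singular covariance.
    Lam = C @ P_pred @ C.T + D @ R_n @ D.T
    K = P_pred @ C.T @ np.linalg.pinv(Lam)
    # Update: drive the constraint value towards zero.
    alpha_new = alpha_pred - K @ np.atleast_1d(h(alpha_pred, y))
    Gamma = np.eye(alpha_pred.size) - K @ C
    P_new = Gamma @ P_pred @ Gamma.T + K @ D @ R_n @ D.T @ K.T
    return alpha_new, P_new

# Hypothetical toy constraint h(alpha, y) = y - H alpha (assumption, for illustration).
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
alpha_true = np.array([0.4, -0.3])
y = H @ alpha_true

def h_toy(alpha, y_meas):
    return y_meas - H @ alpha

alpha0, P0 = np.zeros(2), np.eye(2)
alpha1, P1 = iekf_step(alpha0, P0, y, h_toy, np.zeros((2, 2)), 0.01 * np.eye(3))
```

For the subspace filter the constraint would be $h(\alpha, y) = C^{\perp}(y, \alpha)\, y'$, with the state confined to the sphere.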

3.1 Observability/identifiability of the method

In order to be able to assess the convergence of the above scheme, we must prove its
observability/identifiability. It can be shown, using the analysis of Heeger and Jepson [7], that the scheme
described above is identifiable under general position conditions. Note that the analysis in [7] is
carried out with different motivations; however, when the results are cast into the above
estimation/identification framework, they allow inferring the identifiability of the method.

3.2  Enforcing  rigid  motion:  the  positive  depth  constraint 

When  estimating  motion  from  visible  points,  we  must  enforce  the  fact  that  the  measured  points 
are  in  front  of  the  observer.  This  may  be  easily  done  in  the  prediction  step  by  computing  the  mean 
distance  of  the  centroid,  as  indicated  above,  and  checking  whether  it  is  positive.  If  it  is  not,  the 
prediction  is  reflected  on  the  sphere  (the  diametral  point  of  the  state  space  sphere  is  chosen  as  the 
prediction). 
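The reflection of the prediction can be sketched as follows, assuming (as an illustration only, since the text does not fix the local coordinates) a longitude/latitude parameterization of the sphere of directions. NumPy and the function names are assumptions.

```python
import numpy as np

def translation_direction(theta, phi):
    """Assumed longitude/latitude coordinates on the unit sphere."""
    return np.array([np.cos(theta) * np.cos(phi),
                     np.sin(theta) * np.cos(phi),
                     np.sin(phi)])

def antipode(theta, phi):
    """Local coordinates of the diametrically opposite point, theta wrapped to [-pi, pi)."""
    theta_ref = (theta + 2.0 * np.pi) % (2.0 * np.pi) - np.pi
    return theta_ref, -phi

def enforce_positive_depth(theta, phi, mean_depth):
    """Reflect the predicted direction when the estimated centroid depth is negative."""
    if mean_depth < 0.0:
        return antipode(theta, phi)
    return theta, phi
```

For example, `enforce_positive_depth(0.4, -0.3, -2.0)` returns the coordinates of the opposite direction, while a positive mean depth leaves the prediction unchanged.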

When we do not impose such a constraint, the filter may converge to a rigid motion which
corresponds to points moving behind the observer, and which is therefore not physically realizable.
However, if we allow such a condition to happen by releasing the positive depth constraint, and then
feed the estimate to a structure estimation step, such as for example a simple Extended Kalman
Filter [10, 11, 15] initialized with points at positive depth, the result is a rubbery interpretation of
structure which has also been observed in psychophysical experiments.

The geometric interpretation of the rubbery percept is illustrated in figure 1. Note that both
affine 3D motion and similarity transformations viewed under projection admit a geometric
invariant, which is the absolute conic [4]. On the contrary, the orientation (determinant of the
transformation) is not invariant under projection.

4  Experimental  assessment 

We have tested the scheme on real and noisy synthetic image sequences. For the same data
set used in [15], the scheme proves far more robust to the effect of measurement noise. Convergence
is reached from arbitrary initial conditions with noise in the image plane coordinates up to 10 pixel
std. The scheme also works with higher noise levels when properly initialized.

The estimate of the two components of the direction of translation with 8 pixel std noise is
shown in figure 2, together with the estimation error. An estimate for more usual error levels (one
pixel std) is reported in figure 3. In both cases the positive depth constraint has been enforced.
The estimates of rotational velocity are plotted in figure 4.

A typical plot of the residual function, which is the value of the constraint (5) as a function of
$\theta, \phi$, is shown in figure 5. Bright areas indicate a small residual value. The black asterisk
indicates the position of the motion (in the local coordinates of the sphere of directions of translation)
which generated the residual.

[Figure 1 panels: “Rigid percept” and “Rubbery percept”]


Figure  1:  Geometric  interpretation  of  the  “rubbery”  perception;  motion  is  estimated  without 
imposing  the  positive  depth  constraint;  this  may  result  in  a  motion  estimate  which  is  compatible 
with  a  rigid  structure  interpretation  behind  the  observer.  Once  such  a  structure  is  reflected  in 
front  of  the  observer,  it  gives  rise  to  the  perception  of  a  rubbery  structure  rotating  in  the  opposite 
direction  of  the  true  one. 


[Figure 2 panels: “Direction of translation: 8 pixel std noise” (left) and “Error in the direction of translation: 8 pixel std noise” (right)]

Figure 2: (Left) Estimates of the two components of the direction of translation. The error in the
image plane measurements had 8 pixel standard deviation. The initial conditions were zero for
both components. The ground truth is in dotted lines. (Right) Estimation error for the direction of
translation. With noise of 8 pixel std in the data, the estimates are still within 10% of the true
value. The positive depth constraint has been enforced.


[Figure 3 panels: “Direction of translation: 1 pixel std noise” (left) and “Error in the direction of translation: 1 pixel std noise” (right)]

Figure 3: Estimates and errors for the direction of translation when the noise in the image plane
has a standard deviation of 1 pixel (in line with the performance of common optical flow/feature
tracking schemes). Note that convergence is reached from zero initial condition in about 10 steps.


Figure 4: Estimates for the components of rotational velocity.

[Figure 5 panels: “Residual function: theta = 1.084, phi = -0.6478, Omega = (illegible), noise = 0” (left) and “Residual function: theta = -1.571, phi = 0.04363, Omega = 0.0872700, noise = 0” (right)]

Figure 5: Plots of the residual function in the local coordinates of the sphere of directions of
translation. Bright regions denote small residuals. The black asterisk is the “true”
motion which generated the residual. Note that for small rotation (left) the minimum of the residual
coincides with the true motion. When rotation is large (right) the Euler step approximation is
no longer valid, and the minimum moves away from the true location.


Note that the minimum of the residual is displaced from the true motion when the norm
of the rotational velocity is large. This is due to the fact that we approximate the velocity of
the projected points (motion field) with first differences; the approximation is good as long as
$R = e^{\Omega\wedge} \approx I + \Omega\wedge$, i.e. as long as the norm of the rotation is small.
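The quality of the first-order approximation of the rotation can be checked numerically with the minimal sketch below, which compares the exact rotation matrix (Rodrigues' formula) with $I + \Omega\wedge$. NumPy and the function names are assumptions made for illustration.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ X equals the cross product w ^ X."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rotation_exp(w):
    """Rotation matrix exp(hat(w)) computed with Rodrigues' formula."""
    angle = np.linalg.norm(w)
    W = hat(w)
    if angle < 1e-12:
        return np.eye(3) + W
    return (np.eye(3)
            + (np.sin(angle) / angle) * W
            + ((1.0 - np.cos(angle)) / angle ** 2) * (W @ W))

def euler_step_error(w):
    """Frobenius norm of exp(hat(w)) - (I + hat(w))."""
    return np.linalg.norm(rotation_exp(w) - (np.eye(3) + hat(w)))

small = euler_step_error(np.array([0.0, 0.01, 0.0]))  # rotation of 0.01 rad per frame
large = euler_step_error(np.array([0.0, 0.5, 0.0]))   # rotation of 0.5 rad per frame
```

The error grows roughly with the square of the rotation angle per frame, which is consistent with the displacement of the minimum observed for large rotational velocity.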

Note the presence of local minima, as may be seen from the mesh plot of the cost function
(figure 6). The filter may temporarily enter one of these. In figure 7 (right) we show the
temporary convergence of the filter to a local minimum. In figure 7 (left) we show the convergence
to the “rubbery motion interpretation” when the positive depth constraint is released.

In figure 8 we show the convergence of the filter to the rubbery interpretation
(left) and to rigid motion (right), plotted on the image of the cost function. When the positive
depth constraint is not enforced, the filter may converge either to the rigid or to the rubbery
interpretation (figure 9 left). However, when imposing the positive depth constraint, the estimate
is reflected onto the correct rigid interpretation (figure 9 right; see also figure 10 right for the state
estimates).

When we feed the motion estimates to a structure from motion module which processes motion
error and is initialized with points at positive depth, we may observe either a rigid set of points
which move according to the correct motion (a top view of the points is shown in figure 12 left) or
a “rubbery” percept (figure 12 right). This is in accordance with the experience in psychophysical
experiments.

4.1  Comparison  with  the  essential  filter 

The filter proposed in this paper proves far less sensitive to noise in the measurements and to the
initial conditions when compared to the essential filter [13]. In particular, for 20 observed points,
the essential filter converges for initial conditions within 30%, while the subspace filter converges
from any initial condition. The essential filter is faster in converging, reaching regime in 5 to 15
frames, while the subspace filter takes on average 10-20 frames. However, the essential filter is more
sensitive to noise, and the subspace filter may tolerate up to 5 times more error on the measured
image plane coordinates, i.e. up to more than 10 pixel std. The contour plots of the residual
function for the subspace filter and the essential filter may be compared in figure 11. Note that for
the essential filter only a two-dimensional slice of the five-dimensional residual is plotted.

Figure 6: (Left) Mesh plots of some typical residual functions; note the presence of local minima.

[Figure 7 panels: “Direction of translation” (left) and “Direction of translation” (right)]

Figure 7: (Left) Convergence to the local minimum corresponding to the rubbery interpretation
when the positive depth constraint is not enforced. (Right) Convergence to a local minimum and
then to the correct rigid motion when the positive depth constraint is enforced.

[Figure 8 panels: “Convergence to the rubbery interpretation” (left) and “Convergence to the correct interpretation” (right)]

Figure 8: Convergence to the “rubbery interpretation” (left) versus convergence to the rigid motion
interpretation (right).

Figure 9: (Left) Convergence when not imposing the positive depth constraint: the filter may
converge either to the correct rigid interpretation (top) or to the local minimum corresponding to the
“rubbery” interpretation (bottom). However, when imposing the positive depth constraint (right),
the filter only converges to the correct rigid motion interpretation.

[Figure panel title: “Direction of translation”]

Figure 11: Comparison of the contour plots of the residual function for the subspace filter (left)
and the essential filter (right). The slice of the residual surface is plotted along the two dimensions
in the coordinates of the minimum. Note that the essential filter seems to have a simpler residual,
with no local minima except for the one corresponding to the rubbery structure. However, for the
essential filter this is only a two-dimensional slice of the more complicated residual, which is defined
on a five-dimensional space.

Figure 12: Convergence of a structure from motion module to a rigid interpretation of structure
(left) or to a rubbery object rotating in the opposite direction (right). The plots show a top view
of the points, with the image plane at the lower end.

In the essential filter the positive depth constraint is encoded directly in the definition of the
state space manifold (the essential manifold). The convergence of the essential filter is illustrated in
figure 13: on the left, convergence is shown when starting from the rubbery motion interpretation
and imposing positive depth. On the right, the positive depth constraint has been released
(equivalently, reflections are allowed in the essential manifold), and therefore we may occasionally
observe convergence to the local minimum corresponding to the rubbery interpretation.

4.2  Experiments  with  real  image  sequences 

We have tested the scheme on real image sequences: the noise level achieved by the most
common feature tracking/optical flow techniques is easily handled by the filter. As an example we
report here the filter estimates for the rocket scene, for comparison with [13]. Because
the filter takes about 20 frames to converge, we have doubled the sequence and used the first run as
the initial condition for the second run, which is displayed in figure 14. We are in the process of
testing the scheme on longer image sequences.

5  Conclusions 

We have formulated a new recursive scheme for estimating rigid motion under perspective via
identifying a dynamic model in exterior differential form. The motivation comes from Heeger and
Jepson [5], who propose to view motion estimation as an optimization problem constrained to
a subspace. They solve the minimization by extensive search.

Using results from nonlinear estimation and identification theory, we formulate a motion
estimator which is fast, computationally efficient, accurate and more robust than any recursive motion
estimation scheme we have implemented. Extensive experiments have been performed that
highlight such features.

[Figure 13 panels: “Convergence of the essential filter when imposing positive depth” (left) and “Essential filter: convergence when not imposing positive depth” (right)]

Figure 13: Convergence of the essential filter: (left) the filter automatically imposes the positive
depth constraint; even starting from the rubbery state, the filter switches to the correct estimate.
(Right) Releasing the positive depth constraint, it is possible for the filter to converge to the rubbery
interpretation.

Figure 14: (Left) Estimate of the direction of translation for the rocket scene. (Right) One image
of the rocket scene.